Partitioning clustering algorithms for protein sequence data sets
Similar resources
Genetic Algorithms for Partitioning Sets
We first revisit a problem from the literature, that of partitioning a given set of numbers into subsets such that their sums are as nearly equal as possible. We devise a new genetic algorithm, Eager Breeder, for this problem. The algorithm is distinctive in its novel and aggressive way of extracting parental genetic material when forming a child partition, and its results are a substantial imp...
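To make the objective in this abstract concrete, here is a minimal sketch of the number-partitioning problem using a simple greedy largest-first heuristic. This is not the Eager Breeder genetic algorithm, whose details the abstract does not give; it is only a baseline that a genetic search would aim to beat. The function names `greedy_partition` and `imbalance` are illustrative assumptions.

```python
import heapq

def greedy_partition(numbers, k):
    """Assign each number, largest first, to the subset with the smallest running sum.

    A plain greedy baseline (an assumption for illustration), not the
    genetic algorithm described in the abstract.
    """
    heap = [(0, i) for i in range(k)]   # (current subset sum, subset index)
    subsets = [[] for _ in range(k)]
    for x in sorted(numbers, reverse=True):
        total, i = heapq.heappop(heap)  # subset with the smallest sum so far
        subsets[i].append(x)
        heapq.heappush(heap, (total + x, i))
    return subsets

def imbalance(subsets):
    """Objective to minimize: gap between the largest and smallest subset sums."""
    sums = [sum(s) for s in subsets]
    return max(sums) - min(sums)

parts = greedy_partition([8, 7, 6, 5, 4], 2)
gap = imbalance(parts)
```

On this input the greedy split leaves a gap of 4, while the split {8, 7} and {6, 5, 4} has a gap of 0, which is the kind of shortfall a stronger search method targets.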
Estimating Sequence Similarity from Read Sets for Clustering Sequencing Data
Clustering biological sequences is a central task in bioinformatics. The typical result of new-generation sequencers is a set of short substrings (“reads”) of a target sequence, rather than the sequence itself. To cluster sequences given only their read-set representations, one may try to reconstruct each one from the corresponding read set, and then employ conventional (dis)similarity measures...
Empirical Comparison of Fast Clustering Algorithms for Large Data Sets
Several fast algorithms for clustering very large data sets have been proposed in the literature. CLARA is a combination of a sampling procedure and the classical PAM algorithm, while CLARANS adopts a serial randomized search strategy to find the optimal set of medoids. GAC-R and GAC-RARw exploit genetic search heuristics for solving clustering problems. In this research, we conducted an empiri...
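The description of CLARA as sampling combined with PAM can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation evaluated in the paper: the inner `pam` function is a basic first-improvement swap search standing in for full PAM, points are numeric tuples compared by Euclidean distance, and the parameter names `sample_size` and `n_samples` are hypothetical.

```python
import math
import random

def total_cost(points, medoids):
    """Sum of distances from each point to its nearest medoid."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points)

def pam(points, k, rng):
    """Simplified PAM-style search: swap medoids with non-medoids while cost drops."""
    medoids = rng.sample(points, k)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in points:
                if cand in medoids:
                    continue
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                if total_cost(points, trial) < total_cost(points, medoids):
                    medoids, improved = trial, True
    return medoids

def clara(points, k, sample_size, n_samples, seed=0):
    """Run the medoid search on random samples; keep the medoids best on ALL points."""
    rng = random.Random(seed)
    best, best_cost = None, math.inf
    for _ in range(n_samples):
        sample = rng.sample(points, min(sample_size, len(points)))
        medoids = pam(sample, k, rng)
        cost = total_cost(points, medoids)  # evaluated on the full data set
        if cost < best_cost:
            best, best_cost = medoids, cost
    return best, best_cost
```

The key design point is that the expensive swap search only ever sees a sample, while the full data set is used just once per sample to score the resulting medoids.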
Partitioning Sets with Genetic Algorithms
We first revisit a problem in the literature of genetic algorithms: arranging numbers into groups whose summed weights are as nearly equal as possible. We provide a new genetic algorithm which very aggressively breeds new individuals which should be improved groupings of the numbers. Our results improve upon those in the literature. Then we extend and generalize our algorithm to a related class...
Journal
Journal title: BioData Mining
Year: 2009
ISSN: 1756-0381
DOI: 10.1186/1756-0381-2-3